Divergent interpersonal neural synchronization patterns in the first, second language and interlingual communication

An accumulating number of studies have highlighted the importance of interpersonal neural synchronization (INS) between interlocutors in successful verbal communications. The opportunities for communication across different language contexts are rapidly expanding, thanks to the frequent interactions among people all over the world. However, whether the INS changes in different language contexts and how language choice affects the INS remain scarcely explored. The study recruited twenty pairs of participants to communicate in the first language (L1), second language (L2) and interlingual contexts. Using functional near-infrared spectroscopy (fNIRS), we examined the neural activities of interlocutors and analyzed their wavelet transform coherence to assess the INS of dyads. Results showed that as compared to the resting state, stronger INS was observed at the left inferior temporal gyrus, middle temporal gyrus, pre-motor and supplementary motor cortex, dorsolateral prefrontal cortex, and inferior frontal gyrus in L1; at the left middle temporal gyrus, superior temporal gyrus, and inferior frontal gyrus in L2; at the left inferior temporal gyrus and inferior frontal gyrus in interlingual context. Additionally, INS at the left inferior frontal gyrus was significantly stronger in L2 than in L1. These findings reveal the differences of the INS in different language contexts and confirm the importance of language choice for the INS changes.


Results
Behavioral results. Overall accuracy on the understanding test was high (mean: 0.884, SD: 0.090), indicating a high level of communication quality or mutual understanding between interlocutors. The two-way repeated-measures ANOVA for accuracy showed no significant effect of participant or communication condition, nor was there a significant interaction between participant and communication condition (ps > 0.05). These results indicated that both Participants A and C communicated successfully in all the tasks, and that similarly successful understanding was achieved regardless of the language context.
The two-way repeated-measures ANOVA for reaction time (RT) showed a significant interaction between participant and communication condition (F(2, 76) = 3.267, p = 0.044). Further simple-effect analysis showed that in the CE condition, the RT of Participant A using L1 (15,830.550 ± 3501.494 ms) was significantly shorter than that of Participant C using L2 (17,957.526 ± 2398.684 ms) (p = 0.031). There was a significant main effect of communication condition (F(2, 78) = 6.616, p = 0.002). Pairwise comparisons (Fig. 1) showed that participants made the true-or-false judgments with significantly shorter RT in the CC condition than in the EE condition (p = 0.003, M CC = 15,340.088 ms, SD CC = 3130.878; M EE = 17,454.668 ms, SD EE = 3160.224) and in the CE condition (p = 0.029, M CE = 16,894.038 ms, SD CE = 3152.163). No significant difference in RT was found between the EE and CE conditions (p = 1.000). There was no main effect of participant (F(1, 38) = 0.434, p = 0.514). These findings indicated that the conversations were successful in information exchange but that communication involving L2 (English) needed more time for processing than communication in L1.
(Fig. 3). No significant difference was observed between the L1 condition and the interlingual condition or between the L2 condition and the interlingual condition at CH13, nor were there significant differences at any other CHs. The data of HbR concentration were also analyzed. No significant effect of communication condition was observed for HbR concentration.

Discussion
The study examined neural synchronization between the speaker and the listener in different language contexts. The results revealed that the channels showing INS varied across the three conditions as compared to the resting state, suggesting that the hierarchical nature of language processing and language specificity affect the similarity of interlocutors' neural patterns in verbal communication. Moreover, the INS increase at the left IFG, where verbal working memory might function, was significantly stronger during L2 interaction than during L1 interaction. These findings are discussed below.
First, the cortical areas with an increase in INS differed across the three conditions. One possible account for the discrepancies might be the hierarchical nature of language processing. Language processing involves hierarchical layers, such as phonology, semantics and pragmatics 23. The INS increase in L1, L2 and interlingual interactions showed distinct speaker-listener engagement for different layers.
Figure 1. Comparisons of RT between L1, L2 and interlingual contexts when making the true-or-false judgements. Shorter RT for L1 condition than for interlingual condition, shorter RT for L1 condition than for L2 condition. RT reaction time, CC L1 context, CE interlingual context, EE L2 context. *p < 0.05, **p < 0.01.
At the phonological level, unlike the other conditions, communication in L2 showed an INS increase at the left STG. Neural coupling between speech production and comprehension in English in this region was also observed in previous research 3,9,34,35. Those studies adopted a one-way communication paradigm, investigating "information flow" from the brain of the sender (the speaker) to the brain of the receiver (the listener). In contrast, ours extended the paradigm to two-way communication, in which the interlocutors alternated between the roles of speaker and listener. The INS might be due to the involvement of the STG in the external loop of self-monitoring, as researchers have suggested 3,9. Some additional accounts might also contribute to the finding. In the dual-stream model of speech proposed by Hickok and Poeppel 41, the STG is involved in spectrotemporal analysis. Moreover, the STG has been shown to be associated with lexical phonological encoding 42. The STG also plays an important role in phonological processing during speech production. Thus, it is not surprising to find an INS increase at the left STG in L2 (English) communication, which we interpret as interlocutors' brain-to-brain synchronization at the phonological level.
At the semantic level, we found a significant INS increase against the resting state at the left MTG in the L1 and L2 conditions, and at the left ITG in the L1 and interlingual conditions. We interpret these findings as production-comprehension alignment at the semantic level. The left MTG serves lexical access 43 and sentence processing 44, both of which are necessary for production and comprehension. The left MTG has also been found to be interconnected through white matter pathways with many language-related cortical regions in the frontal, parietal and temporal lobes 45. INS in this region was similarly found in a previous study in which participants produced and comprehended naturalistic narrative speech 9. Taken together, the INS increase observed at the left MTG suggests that dyads understood each other well in both L1 and L2.

The ITG is involved in semantic processing as well. Sharp et al. 46 observed greater activation in the ITG during processing of the meaning of single words than during a rotated-speech baseline. The ITG has also been suggested to be responsible for semantic storage and for discriminating grammatically correct sentences 47. In addition, meta-analyses suggested that the ITG is involved in both language reception and understanding in the lexical-semantic system 48. Accordingly, the robust INS increase during L1 and interlingual conversations in the present study might signify that the dyads achieved mutual understanding at the semantic level, sharing content irrespective of language form.
At the pragmatic level, only participants in L1 showed an INS increase at higher-order, extralinguistic areas such as the SMA and dlPFC. Previous research indicated that the SMA was involved in emotion 49 and empathy 50, which might be required for integrating pragmatic context in semantic tasks. As for the dlPFC, previous studies have described it as a key region for pragmatic processing. The dlPFC was activated in a range of tasks where pragmatic involvement is necessary, such as processing sarcastic sentences 51 and metaphors 52. Moreover, the dlPFC was engaged in discourse coherence 53. Our finding of an INS increase at the dlPFC was consistent with that of a previous study on one-way communication of stories 3. Pérez et al. 4 also found stronger INS in the alpha band in fronto-central areas in L1 than in L2. The coupling pattern in these areas between the speaker and listener in L1 might suggest higher-level processing of complex dynamic information exchange between the two communicators.
To summarize, the widespread INS between interlocutors during L1 communication covered the left SMA, dlPFC, MTG, and ITG, which were recruited for mutual understanding at the linguistic levels of pragmatics and semantics. In L2, INS was achieved at the left MTG and STG, showing inter-brain synchronization at the semantic and phonological levels. Finally, in the interlingual condition, brain-to-brain coupling was found at the left ITG, indicating mutual understanding at the semantic level. This pattern might result from the different language codes in the experiment, with Chinese as the first language and English as the second language. As previous research suggested, foreign language contexts represent a circumstance where intelligibility is blocked 4, 12. Thus, although the behavioral statistics showed similar communication quality, the linguistic levels at which the listener synchronized with the speaker differed when the language context changed.
Second, the INS in different language contexts exhibited language specificity to some extent. We found a significant INS increase at the dlPFC in the Chinese (L1) condition compared with the resting state, but no similar neural pattern in this region in the English (L2) condition. The dlPFC overlaps with the middle frontal gyrus, which was reported to be more involved in Chinese character recognition than in English word recognition 24. Other contrasts between Chinese and English provide similarly compelling evidence. For example, comparisons have found this region more involved in Chinese reading than in English reading 54. It also exhibits greater activation during handwriting in Chinese than in alphabetic languages 55.
In contrast, a significant INS increase against the resting state was observed at the left STG in English, but not in Chinese. Meta-analysis has found that the left STG is engaged in alphabetic languages more than in logographic ones 24,54. It was identified as a region critical for alphabetic word reading.
As the evidence above suggests, the different regions showing significant INS in Chinese and English might be attributed to the language specificity of those regions. Our findings provide tentative hyperscanning evidence for language specificity.
Third, compared with L1 communication, L2 communication elicited a significantly stronger INS increase at the left IFG. The IFG was previously found to correlate with working memory (WM) 37. WM helps information to be held for much longer spans, during which the subject rehearses the phonological information subvocally and repeatedly to avoid losing it 56. Assisted by WM, communicators understood each other and pushed ahead with the communication under all three conditions. Abundant research has confirmed the links of WM to language production and comprehension in L1 31, 32 as well as in L2 57,58. However, WM was reported to differ between L1 and L2 59, and there was an interplay between WM and (foreign) language proficiency 60. Fewer WM resources are needed when comprehension of sentences improves with practice. L1 communicators are experts in their L1, so their L1 processing is automatic, especially when surface form and text base are involved 13. They process L1 quickly and efficiently, and consume a small amount of cognitive resources. In contrast, L2 interlocutors have more difficulty processing linguistic information under time pressure while communicating, because of their less efficient lexical and grammatical knowledge of the target language 33. This is supported by the behavioral findings in the present study: the reaction time for true-or-false judgments was significantly longer in L2 than in L1, even though there was no significant difference in accuracy. Furthermore, our findings at the IFG might be an indicator of INS at the extra-linguistic level, namely the recruitment of WM resources, supporting the generative theoretical model of interpersonal verbal communication 2. When L2 was involved, to achieve a similar level of understanding, the interlocutors allocated more cognitive resources to WM and showed stronger synchronization between them than in L1.
Experimental procedure. The experiment was conducted in a quiet room (Fig. 4a). Participants were initially required to complete a 5-min resting-state session as a baseline, during which they kept their minds relaxed with eyes closed and tried to minimize movements 5. The communication session followed the resting-state session. In total, each pair was required to conduct turn-taking verbal communication under three conditions. Under condition 1, both dyadic partners (Participants A and C) communicated verbally in their native Chinese. Under condition 2, the procedure was the same except that both interlocutors exchanged information in English, their L2. Under condition 3, the interlingual context, one communicator (Participant A) used Chinese while the partner (Participant C) used English. To counterbalance the sequence of the three communication conditions across participant pairs, seven dyads finished the tasks in the order CC, EE, CE; another seven in the order EE, CC, CE; and the remaining six in the order CE, EE, CC. After each condition, participants rested until they felt sufficiently recovered for the next task.

For each condition (L1, L2, and interlingual), two different common topics were presented, totaling six topics across all conditions. The topics were traveling and movies for the L1 condition, weekend plans and online courses for the L2 condition, and picnics and pets for the interlingual condition. Before each condition, one short example was provided to ensure participants' familiarity with the task procedure. After each topic, three true-or-false questions were presented on the screen for communicators to judge. These questions were designed to determine whether the participants understood the communication well and whether different contexts posed different levels of difficulty. Participants had to finish each judgement within 6 s; otherwise, it was regarded as a wrong answer. In all conditions, participants in each dyad were required to alternate between the "speaker" and "listener" roles every 20 s (Fig. 4b). This exchange between talking and listening continued for 320 s for each topic.
Conversations began with Participant A, and a "ding" cue indicated which member of the dyad had to speak and which had to listen. When the "ding" sounded, Participant A saw sentences appearing on the screen with a 20-s countdown and was required to verbalize the content on the screen. Meanwhile, Participant C saw a speaker icon on the screen and was required to listen carefully via microphones. Take a short practice sentence as an example. When Participant A said "Today is my sister's graduation day, and I bought some flowers for her." as cued by the screen, Participant C would hear the sentence's recording through a microphone. Then, Participant C would see "It must have cost you a fortune. It is normally very expensive." on their screen and speak within 20 s, while Participant A heard its recording through a microphone. In the L1 condition, all materials, whether visual or auditory, were in Chinese, while in the L2 condition, all materials were in English. In the interlingual condition, Participant A saw, spoke, and heard everything in Chinese, while Participant C saw, spoke, and heard everything in English.
This procedure ensured that, in the interlingual context, Participant A heard (or spoke) the same content in Chinese exactly when Participant C spoke (or heard) it in English, without any time lag between the communicating partners' articulatory and auditory processes, as timing profoundly affects INS research 7,34. Additionally, using interpretations recorded in advance eliminated the influence of interpretation quality and avoided participants directly hearing their partner in the interlingual context, which could significantly affect their neural activity.
fNIRS data acquisition. A continuous wave fNIRS system (ETG-7100, Hitachi Medical, Tokyo, Japan) was employed to measure cortical oxyhemoglobin, deoxyhemoglobin, and hemoglobin signals. Each participant had two optode probe sets placed on them. One probe set consisted of a 3 × 5 array of optodes with 8 emitters and 7 detectors, resulting in 22 channels. This set was positioned on the left frontal, temporal, and parietal cortices, with the middle optode of the lowest probe row and the bottom right detector placed at T7 and Tp7, respectively, in accordance with the international 10-10 system. Another probe set, consisting of 2 emitters and 2 detectors with 4 channels, was used to cover the right temporal-parietal (rTPJ) region (Fig. 5a). Its top right and bottom right optodes were placed at Cp4 and Cp6, respectively, in accordance with the international 10-10 system. However, this set served other research questions and was reported in a separate paper. The distance between each emitter and detector was 3 cm. The correspondence between each channel and the international 10-10 system positions is shown in Fig. 5b. A 3-D magnetic space digitizer (FASTRAK; Polhemus, United States) was used to capture the 3-D spatial coordinates of each optode placed on the participant's scalp. Each channel's corresponding coordinates in Montreal Neurological Institute (MNI) space were estimated using a probabilistic registration method from the NIRS_SPM software. The mean 3D MNI coordinates and associated brain regions of the 26 channels are provided in Table 1.
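As a quick consistency check on the reported channel counts, the sketch below assumes (this is not stated in the text) that emitters and detectors alternate on a rectangular grid, so that every pair of horizontally or vertically adjacent optodes forms one channel, and that the second probe set is arranged as a 2 × 2 grid. The helper name `grid_channels` is hypothetical.

```python
def grid_channels(rows: int, cols: int) -> int:
    """Count adjacent optode pairs on a rows x cols grid (assumed layout)."""
    horizontal = rows * (cols - 1)  # neighbouring pairs within each row
    vertical = (rows - 1) * cols    # neighbouring pairs within each column
    return horizontal + vertical


left_set = grid_channels(3, 5)   # 3 x 5 array: 8 emitters + 7 detectors
right_set = grid_channels(2, 2)  # 2 emitters + 2 detectors (assumed 2 x 2)
total = left_set + right_set
print(left_set, right_set, total)
```

Under these assumptions the counts agree with the 22 and 4 channels described for the two probe sets and with the 26 channels listed in Table 1.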
Data from individual channels were collected at two wavelengths, 695 nm and 830 nm, at a sampling rate of 10 Hz. The changes in oxyhemoglobin (HbO) and deoxyhemoglobin (HbR) concentration were calculated based on the modified Beer-Lambert law. The following statistical analyses focused on HbO concentration changes because HbO was previously validated as the most sensitive signal to hemoglobin changes 5,39.
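For reference, the modified Beer-Lambert law is commonly written in the two-chromophore form below. The notation is an assumption introduced here rather than taken from the text: ΔOD is the change in optical density, ε the extinction coefficient, d the emitter-detector distance, and DPF the differential pathlength factor. The coefficient values used by the acquisition system are not given in the text.

```latex
\Delta OD(\lambda_i) =
  \left[ \varepsilon_{\mathrm{HbO}}(\lambda_i)\,\Delta c_{\mathrm{HbO}}
       + \varepsilon_{\mathrm{HbR}}(\lambda_i)\,\Delta c_{\mathrm{HbR}} \right]
  \, d \,\mathrm{DPF}(\lambda_i),
  \qquad i = 1, 2
```

With measurements at two wavelengths (here 695 nm and 830 nm), this gives two linear equations in the two unknowns Δc_HbO and Δc_HbR, which can be solved for each channel and time point.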

Behavioral data analysis. To investigate the quality of communication across interlocutors and communication conditions, we conducted a two-way repeated-measures ANOVA with a 2 × 3 design. Accuracy and reaction time for judging true-or-false questions were the dependent variables. The between-subjects factor was
fNIRS data analysis. Data collected during the resting state and task sessions were analyzed using Wavelet Transform Coherence (WTC) to identify the cross-correlation between the two fNIRS time series generated by a pair of participants as a function of both frequency and time 39,63. The HbO time series of each channel were obtained simultaneously from each pair of participants, and WTC was then applied to the two time series, generating a 2D matrix of coherence values whose two dimensions represent frequency and time. All coherence values were converted into Fisher z-values and averaged across the time series. To remove high- and low-frequency noise, such as that associated with respiration (about 0.2-0.3 Hz) and cardiac pulsation (about 1 Hz), a period band of 10-40 s (corresponding to frequencies of 0.1-0.025 Hz, respectively) was selected for statistical analyses.
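The post-processing of a coherence matrix described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the coherence values are made up, the function name `band_mean_fisher_z` is hypothetical, and it assumes that values are averaged over both time and the selected 10-40 s band with inclusive band edges.

```python
import math


def band_mean_fisher_z(coherence, periods, lo=10.0, hi=40.0):
    """Mean Fisher z-transformed coherence within a period band.

    coherence: list of rows, one per period, each holding values over time.
    periods:   period in seconds for each row.
    """
    z_values = []
    for period, row in zip(periods, coherence):
        if lo <= period <= hi:
            # Clip so that atanh stays finite if a coherence value equals 1.
            z_values.extend(math.atanh(min(c, 0.999999)) for c in row)
    return sum(z_values) / len(z_values)


# Toy coherence matrix (made-up numbers): 4 periods x 3 time points.
periods = [5.0, 10.0, 20.0, 80.0]
coherence = [
    [0.9, 0.9, 0.9],  # 5 s period: outside the band
    [0.5, 0.5, 0.5],  # 10 s period: inside the band
    [0.5, 0.5, 0.5],  # 20 s period: inside the band
    [0.1, 0.1, 0.1],  # 80 s period: outside the band
]
ins = band_mean_fisher_z(coherence, periods)
band_hz = (1 / 40.0, 1 / 10.0)  # period band expressed in Hz
print(ins, band_hz)
```

The Fisher z-transform is the inverse hyperbolic tangent, which makes bounded coherence values more suitable for averaging and for parametric tests.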
To verify that the INS increase was specific to the linguistic context, a permutation test was applied. Participants were randomly regrouped to form new pairs, and the INS increase was recalculated. Paired t-tests were applied to the INS increase. The permutation was carried out 1000 times to produce a distribution (F value) for all CHs, which was then compared with the original data. A false discovery rate (FDR) method (p < 0.05) was used for correction.
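The re-pairing logic of such a permutation test can be sketched as follows. Several details are assumptions, since the text does not specify them: that no original dyad is kept together in a pseudo-pairing, and that the empirical p-value uses the common (hits + 1) / (n + 1) convention. The function names are hypothetical, and the null distribution is filled with placeholder random numbers rather than recomputed INS statistics.

```python
import random


def pseudo_pairs(n_dyads, rng):
    """Randomly re-pair partners so that no original dyad stays together."""
    while True:
        partners = list(range(n_dyads))
        rng.shuffle(partners)
        if all(a != c for a, c in enumerate(partners)):
            return list(enumerate(partners))


def permutation_p(observed, null_values):
    """Share of permuted statistics at least as large as the observed one."""
    hits = sum(1 for value in null_values if value >= observed)
    return (hits + 1) / (len(null_values) + 1)


rng = random.Random(0)
# Each tuple: (index of Participant A, index of a Participant C from another dyad)
pairs = pseudo_pairs(20, rng)

# Placeholder null distribution: in a real analysis each value would be the
# statistic recomputed from one set of pseudo-pairs (1000 permutations).
null = [rng.gauss(0.0, 1.0) for _ in range(1000)]
p_value = permutation_p(3.5, null)
print(pairs[:3], p_value)
```

If the observed statistic sits far in the upper tail of the pseudo-pair distribution, the synchronization is unlikely to reflect only the shared task structure.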
Finally, a paired t-test was performed between each task session and the resting-state session separately. A one-way repeated-measures ANOVA was conducted on the INS increase over all CHs, with communication condition (CC vs. EE vs. CE) as the within-subjects factor.
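For one channel, the task-versus-rest comparison amounts to a paired t-test on per-dyad values. The sketch below computes only the t statistic and degrees of freedom using the standard library. The coherence values are made up for illustration, and the helper name `paired_t` is hypothetical.

```python
import math
import statistics


def paired_t(task, rest):
    """Paired t statistic and degrees of freedom, one value pair per dyad."""
    diffs = [t - r for t, r in zip(task, rest)]  # INS increase per dyad
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Made-up Fisher z coherence values for five dyads at a single channel.
task = [0.42, 0.38, 0.51, 0.47, 0.40]
rest = [0.35, 0.36, 0.40, 0.41, 0.37]
t_stat, df = paired_t(task, rest)
print(t_stat, df)
```

In a full analysis this would be repeated for every channel and condition, with the resulting p-values corrected for multiple comparisons.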
Ethical approval. The study protocol was approved by the Ethics Committee on Human Experimentation for Key Laboratory for Artificial Intelligence and Cognitive Neuroscience of Language, Xi'an International Studies University (protocol code 2021-AICNL-f002 and date of approval: 18 July 2021). The study was performed in accordance with the Declaration of Helsinki. Written informed consent was obtained from all participants.

Data availability
The data generated or analyzed during this study will be shared on reasonable request to the corresponding author.